34 PART 1 Getting Started with Biostatistics
But the idea of doing a census to calculate such a parameter is not practical. Even
if we somehow had a list of everyone in the city we could contact, it would not
be feasible to visit all of them and measure their SBP. Nor would it be necessary.
Using inferential statistics, we could draw a sample from this population, measure
their SBPs, and calculate the mean as a sample statistic. Using this approach, we
could estimate the mean SBP of the population.
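The idea of estimating a parameter from a sample statistic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is simulated, and the values chosen for it (100,000 people, a mean SBP of 125 mmHg, a standard deviation of 15 mmHg) and the sample size of 200 are hypothetical, not taken from any real city.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 simulated SBP values (mmHg).
# The mean of 125 and SD of 15 are illustrative assumptions, not real data.
population = [random.gauss(125, 15) for _ in range(100_000)]

# The parameter: known here only because the population is simulated.
population_mean = statistics.mean(population)

# Draw a simple random sample of 200 people without replacement.
sample = random.sample(population, 200)

# The sample statistic: our estimate of the population mean.
sample_mean = statistics.mean(sample)

print(f"Population mean (parameter): {population_mean:.1f}")
print(f"Sample mean (statistic):     {sample_mean:.1f}")
```

In a real study, only the sample mean would be available. The simulated population mean is printed here just to show that the estimate lands near it without matching it exactly.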
But drawing a sample that is representative of the background population depends
on probability (as well as other factors). In the following sections, we explain why
samples are valid but imperfect reflections of the population from which they’re
drawn. We also describe the basics of probability distributions. For a more extensive
discussion of sampling, see Chapter 6.
Recognizing that sampling isn’t perfect
As used in epidemiologic research, the terms population and sample can be defined
this way:
» Population: All individuals in a defined target population. For example, this
may be all individuals in the United States living with a diagnosis of Type II
diabetes.
» Sample: A subset of the target population actually selected to participate in a
study. For example, this could be patients in the United States living with
Type II diabetes who visit a particular clinic and meet other qualification
criteria for the study.
Any sample, no matter how carefully it is selected, is only an imperfect reflection
of the population. This is due to the unavoidable occurrence of random sampling
fluctuations called sampling error.
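Sampling error is easy to see by drawing several samples from the same population and comparing each sample mean to the population mean. The sketch below assumes a simulated population; its size, mean, and spread, along with the helper name `sampling_errors`, are hypothetical choices made only for illustration.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 values; the distribution is an assumption.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sampling_errors(pop, sample_size, n_samples):
    """Return (sample mean - population mean) for repeated random samples."""
    true_mean = statistics.mean(pop)
    return [
        statistics.mean(random.sample(pop, sample_size)) - true_mean
        for _ in range(n_samples)
    ]

# Five independent samples of 25 individuals each.
errors = sampling_errors(population, sample_size=25, n_samples=5)
for i, err in enumerate(errors, start=1):
    print(f"Sample {i}: sampling error = {err:+.2f}")
```

Each sample is drawn the same way from the same population, yet the errors differ in size and sign. That variation from one sample to the next is the random sampling fluctuation described above.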
To illustrate sampling error, we obtained from Statista a data set containing the
number of private and public airports in each U.S. state and the District of Columbia
in 2011 (available at
https://www.statista.com/statistics/185902/us-civil-and-joint-use-airports-2008/).
We started by making a histogram of the entire data set, which would be considered
a census because it contains the entire population of states. A histogram is a
visualization used to examine the distribution of numerical data, and is described
more extensively in Chapter 9. Here, we briefly summarize how to read a histogram:
» A histogram looks like a bar chart. It is specifically crafted to display a
distribution.